Temporal-Semantic Clustering of Newspaper Articles for Event Detection

Authors

  • Aurora Pons-Porrata
  • Rafael Berlanga Llavori
  • José Ruiz-Shulcloper
Abstract

In this paper we introduce a new clustering algorithm for event detection in newspaper articles, which has two main features. First, it uses the temporal references extracted from the document texts to define the document similarity function. Second, the algorithm works hierarchically. At the first level, documents with a high temporal-semantic similarity are grouped into individual events by applying the proposed similarity functions. At the subsequent levels, these events are successively grouped so that more complex events and topics can be identified. The resulting hierarchy describes the structure of topics and events, taking into account their temporal occurrence. Current Topic Detection and Tracking systems cannot accomplish these tasks.
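The abstract does not give the similarity functions or the clustering procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a bag-of-words cosine similarity that is set to zero when the dates mentioned in two items are more than `max_days` apart, and a simple single-pass threshold clustering that is applied once to documents and once more to the resulting events. All names (`similarity`, `cluster_level`, `doc`), thresholds and sample texts are hypothetical and are not taken from the paper.

```python
from collections import Counter
from datetime import date
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def temporal_distance(dates_a, dates_b):
    """Smallest gap in days between any two dates, or None if either set is empty."""
    gaps = [abs((x - y).days) for x in dates_a for y in dates_b]
    return min(gaps) if gaps else None


def similarity(a, b, max_days):
    """Assumed temporal-semantic similarity: cosine, gated by a time window."""
    gap = temporal_distance(a["dates"], b["dates"])
    if gap is not None and gap > max_days:
        return 0.0
    return cosine(a["terms"], b["terms"])


def merge(members):
    """Represent a cluster by summed term counts and the union of its dates."""
    terms, dates = Counter(), set()
    for m in members:
        terms.update(m["terms"])
        dates |= set(m["dates"])
    return {"terms": terms, "dates": dates, "members": members}


def cluster_level(items, threshold, max_days):
    """Single pass: each item joins its most similar cluster or starts a new one."""
    clusters = []
    for item in items:
        best, best_sim = None, threshold
        for i, c in enumerate(clusters):
            s = similarity(item, c, max_days)
            if s >= best_sim:
                best, best_sim = i, s
        if best is None:
            clusters.append(merge([item]))
        else:
            clusters[best] = merge(clusters[best]["members"] + [item])
    return clusters


def doc(text, *dates):
    """Build a toy document from whitespace tokens and its mentioned dates."""
    return {"terms": Counter(text.lower().split()), "dates": set(dates)}


docs = [
    doc("earthquake hits city rescue teams", date(2002, 3, 1)),
    doc("rescue teams search city after earthquake", date(2002, 3, 2)),
    doc("earthquake relief funds approved city", date(2002, 3, 20)),
    doc("election results announced", date(2002, 3, 2)),
]

# Level 1: tight time window and high threshold group documents into events.
events = cluster_level(docs, threshold=0.5, max_days=3)
# Level 2: looser settings group events into broader topics.
topics = cluster_level(events, threshold=0.3, max_days=30)

print([len(e["members"]) for e in events])
print([len(t["members"]) for t in topics])
```

In this toy run the two earthquake reports from consecutive days form one event, the later relief story forms a separate event because it falls outside the three-day window, and the second level joins both earthquake events into one topic while the election story stays apart. Loosening the threshold and time window at each level is one simple way to obtain a hierarchy; the paper's own functions may differ.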


Similar resources

Building a Hierarchy of Events and Topics for Newspaper Digital Libraries

In this paper we propose an incremental hierarchical clustering algorithm for on-line event detection. This algorithm is applied to a set of newspaper articles in order to discover the structure of topics and events that they describe. In the first level, articles with a high temporal-semantic similarity are clustered together into events. In the next levels of the hierarchy, these events are s...


Detecting Events and Topics by Using Temporal References

In this paper we propose an incremental clustering algorithm for event detection, which makes use of the temporal references in the text of newspaper articles. This algorithm is hierarchically applied to a set of articles in order to discover the structure of topics and events that they describe. In the first level, documents with a high temporal-semantic similarity are clustered together into ...


Semantics-driven Event Clustering in Twitter Feeds

Detecting events using social media such as Twitter has many useful applications in real-life situations. Many algorithms which all use different information sources, either textual, temporal, geographic or community features, have been developed to achieve this task. Semantic information is often added at the end of the event detection to classify events into semantic topics. But semantic informa...


th Workshop on Making Sense of Microposts (#Microposts2015) Big things

Detecting events using social media such as Twitter has many useful applications in real-life situations. Many algorithms which all use different information sources—either textual, temporal, geographic or community features—have been developed to achieve this task. Semantic information is often added at the end of the event detection to classify events into semantic topics. But semantic inform...


Textual Article Clustering in Newspaper Pages

In the analysis of a newspaper page an important step is the clustering of various text blocks into logical units, i.e., into articles. We propose three algorithms based on text processing techniques to cluster articles in newspaper pages. Based on the complexity of the three algorithms and experiment on actual pages from the Italian newspaper L’Adige, we select one of the algorithms as the pre...




Publication date: 2002